Incorporating Spatial Similarity into Ensemble Clustering

Authors

  • M. Hidayath Ansari
  • Nathanael Fillmore
  • Michael H. Coen
Abstract

This paper addresses a fundamental problem in ensemble clustering – namely, how should one compare the similarity of two clusterings? The vast majority of prior techniques for comparing clusterings are entirely partitional, i.e., they examine assignments of points in set-theoretic terms after they have been partitioned. In doing so, these methods ignore the spatial layout of the data, disregarding the fact that this information is responsible for generating the clusterings to begin with. In this paper, we demonstrate the importance of incorporating spatial information into forming ensemble clusterings. We investigate the use of a recently proposed measure, called CDistance, which uses both spatial and partitional information to compare clusterings. We demonstrate that CDistance can be applied in a well-motivated way to four areas fundamental to existing ensemble techniques: the correspondence problem, subsampling, stability analysis and diversity detection.
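The abstract's point that partitional measures ignore spatial layout can be illustrated with a minimal sketch. It uses the Rand index as one example of a partitional measure; the data, the variable names, and the centroid-distance quantity are hypothetical illustrations, and the latter is not CDistance, whose definition is not given here.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical 1-D data: two well-separated groups.
points = [0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0]
reference = [0, 0, 0, 0, 1, 1, 1, 1]

# Variant A reassigns the point at 3.0 (closest to the other group).
near_boundary = [0, 0, 0, 1, 1, 1, 1, 1]
# Variant B reassigns the point at 0.0 (farthest from the other group).
far_from_boundary = [1, 0, 0, 0, 1, 1, 1, 1]

# The Rand index never looks at `points`, so it scores both variants equally.
ri_near = rand_index(reference, near_boundary)
ri_far = rand_index(reference, far_from_boundary)

# An assumed, purely illustrative spatial quantity (NOT CDistance):
# distance from the reassigned point to the centroid of reference cluster 1.
centroid_1 = sum(p for p, c in zip(points, reference) if c == 1) / 4
dist_near = abs(points[3] - centroid_1)
dist_far = abs(points[0] - centroid_1)

print(ri_near, ri_far)      # identical partitional scores
print(dist_near, dist_far)  # the spatial quantity separates the two cases
```

In this toy setup the two variants are indistinguishable to the partitional score even though one reassignment is spatially much less plausible than the other, which is the kind of gap a spatially aware comparison is meant to close.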


Similar resources

Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data

The emergence of high-dimensional data in various areas has brought new challenges to ensemble clustering research. To deal with the curse of dimensionality, considerable efforts in ensemble clustering have been made by incorporating various subspace-based techniques. Besides the emphasis on subspaces, rather limited attention has been paid to the potential diversity in similarity/dissimila...


Spectral Clustering Ensemble and Unsupervised Clustering for Land cover Identification in High Spatial Resolution Satellite Images

Unsupervised clustering plays a dominant role in detailed land-cover identification, specifically in agricultural and environmental monitoring of high spatial resolution remote sensing images. A method called Approximate Spectral Clustering enables spectral partitioning for big datasets to extract clusters with different characteristics without a parametric model. Various information types are use...


From Subspaces to Metrics and Beyond: Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data

The emergence of high-dimensional data in various areas has brought new challenges to ensemble clustering research. To deal with the curse of dimensionality, considerable efforts in ensemble clustering have been made by incorporating various subspace-based techniques. Besides the emphasis on subspaces, rather limited attention has been paid to the potential diversity in similarity/dissimila...


Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering

Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...


Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering

With recent advances in technology, the number of network-based services is increasing, and users' sensitive information is shared over the Internet. Accordingly, large-scale malicious attacks on computer networks could cause severe disruption to network services, so cybersecurity has become a major concern for networks. An intrusion detection system (IDS) could be cons...



Publication date: 2010